A Bernstein-Von Mises Theorem for discrete probability distributions
We investigate the asymptotic normality of the posterior distribution in the
discrete setting, when model dimension increases with sample size. We consider
a probability mass function $\theta_0$ on $\mathbbm{N}\setminus \{0\}$ and a
sequence of truncation levels $(k_n)_n$ satisfying $k_n^3 \leq n \inf_{i \leq k_n} \theta_0(i)$.
Let $\hat{\theta}_n$ denote the maximum likelihood estimate of $(\theta_0(i))_{i \leq k_n}$
and let $\Delta_n(\theta_0)$ denote the $k_n$-dimensional vector whose $i$-th
coordinate is defined by $\sqrt{n}(\hat{\theta}_n(i)-\theta_0(i))$ for
$1 \leq i \leq k_n$. We check that under mild conditions on $\theta_0$ and on the
sequence of prior probabilities on the $k_n$-dimensional simplices, after
centering and rescaling, the variation distance between the posterior
distribution recentered around $\hat{\theta}_n$ and rescaled by $\sqrt{n}$, and
the $k_n$-dimensional Gaussian distribution $\mathcal{N}(\Delta_n(\theta_0), I^{-1}(\theta_0))$,
converges in probability to $0$.
This theorem can be used to prove the asymptotic normality of Bayesian
estimators of Shannon and R\'{e}nyi entropies. The proofs are based on
concentration inequalities for centered and non-centered Chi-square (Pearson)
statistics. The latter allow us to establish posterior concentration rates with
respect to the Fisher distance rather than with respect to the Hellinger
distance, as is commonplace in non-parametric Bayesian statistics.
Comment: Published at http://dx.doi.org/10.1214/08-EJS262 in the Electronic
Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of
Mathematical Statistics (http://www.imstat.org).
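To make the flavor of the result concrete, here is a minimal numerical sketch (our illustration, not the paper's code): for a multinomial model with a Dirichlet prior the posterior is again Dirichlet, so posterior draws can be recentered at the MLE, rescaled by $\sqrt{n}$, and compared with the Gaussian limit, whose covariance is the inverse Fisher information of the model.

    # Minimal sketch (illustration only): rescaled posterior draws for a
    # multinomial model with a Dirichlet prior should have covariance close
    # to the inverse Fisher information diag(theta0) - theta0 theta0^T.
    import numpy as np

    rng = np.random.default_rng(0)
    k, n = 5, 100_000                      # dimension and sample size
    theta0 = np.ones(k) / k                # true pmf
    counts = rng.multinomial(n, theta0)    # observed counts
    theta_hat = counts / n                 # maximum likelihood estimate

    # Posterior under a Dirichlet(1, ..., 1) prior, recentered and rescaled.
    post = rng.dirichlet(1.0 + counts, size=20_000)
    z = np.sqrt(n) * (post - theta_hat)

    emp_cov = np.cov(z, rowvar=False)
    fisher_inv = np.diag(theta0) - np.outer(theta0, theta0)
    print(np.abs(emp_cov - fisher_inv).max())  # small for large n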
Learning with Biased Complementary Labels
In this paper, we study the classification problem in which we have access to
an easily obtainable surrogate for true labels, namely complementary labels,
which specify classes that observations do \textbf{not} belong to. Let $Y$ and
$\bar{Y}$ be the true and complementary labels, respectively. We first model
the annotation of complementary labels via transition probabilities
$P(\bar{Y}=i|Y=j)$, $i \neq j \in \{1,\ldots,c\}$, where $c$ is the number of
classes. Previous methods implicitly assume that $P(\bar{Y}=i|Y=j)$,
$\forall i \neq j$, are identical, which is not true in practice because humans are
biased toward their own experience. For example, as shown in Figure 1, if an
annotator is more familiar with monkeys than prairie dogs when providing
complementary labels for meerkats, she is more likely to employ "monkey" as a
complementary label. We therefore reason that the transition probabilities will
be different. In this paper, we propose a framework that contributes three main
innovations to learning with \textbf{biased} complementary labels: (1) It
estimates transition probabilities with no bias. (2) It provides a general
method to modify traditional loss functions and extends standard deep neural
network classifiers to learn with biased complementary labels. (3) It
theoretically ensures that the classifier learned with complementary labels
converges to the optimal one learned with true labels. Comprehensive
experiments on several benchmark datasets validate the superiority of our
method over current state-of-the-art methods.
Comment: ECCV 2018 Oral.
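The loss-correction mechanism at the heart of such approaches can be sketched as follows (a hypothetical minimal numpy version, not the authors' implementation): given the classifier's predicted distribution $p(y|x)$ over true classes and a transition matrix $Q$ with $Q_{ji} = P(\bar{Y}=i|Y=j)$, the implied distribution over complementary labels is $Q^\top p$, and the cross-entropy is taken against the observed complementary label.

    # Hypothetical minimal sketch of loss correction with a transition
    # matrix (illustration, not the authors' code): the classifier predicts
    # p(y | x) over true classes; Q^T p is the implied distribution over
    # complementary labels.
    import numpy as np

    def corrected_loss(logits, comp_label, Q):
        """Cross-entropy on a complementary label, Q[j, i] = P(Ybar=i | Y=j)."""
        p = np.exp(logits - logits.max())   # softmax over true classes
        p /= p.sum()
        q = Q.T @ p                         # implied complementary-label probs
        return -np.log(q[comp_label] + 1e-12)

    # Toy usage: 3 classes with uniform (unbiased) off-diagonal transitions.
    c = 3
    Q = (np.ones((c, c)) - np.eye(c)) / (c - 1)
    print(corrected_loss(np.array([2.0, 0.5, -1.0]), comp_label=2, Q=Q))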
Structured Random Matrices
Random matrix theory is a well-developed area of probability theory that has
numerous connections with other areas of mathematics and its applications. Much
of the literature in this area is concerned with matrices that possess many
exact or approximate symmetries, such as matrices with i.i.d. entries, for
which precise analytic results and limit theorems are available. Much less well
understood are matrices that are endowed with an arbitrary structure, such as
sparse Wigner matrices or matrices whose entries possess a given variance
pattern. The challenge in investigating such structured random matrices is to
understand how the given structure of the matrix is reflected in its spectral
properties. This chapter reviews a number of recent results, methods, and open
problems in this direction, with a particular emphasis on sharp spectral norm
inequalities for Gaussian random matrices.
Comment: 46 pages; to appear in IMA Volume "Discrete Structures: Analysis and
Applications" (Springer).
PAC-Bayesian Bounds for Randomized Empirical Risk Minimizers
The aim of this paper is to generalize the PAC-Bayesian theorems proved by
Catoni in the classification setting to more general problems of statistical
inference. We show how to control the deviations of the risk of randomized
estimators. Particular attention is paid to randomized estimators drawn from a
small neighborhood of classical estimators, whose study makes it possible to
control the risk of the latter. These results allow us to bound the risk of
very general estimation procedures, as well as to perform model selection.
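For orientation, a representative Catoni-style PAC-Bayesian deviation bound (given here in a generic form, not necessarily the exact statement proved in the paper) reads as follows: for a loss bounded in $[0,1]$, a fixed prior $\pi$, a fixed $\lambda > 0$ and an i.i.d. sample of size $n$, with probability at least $1-\delta$, simultaneously for all posteriors $\rho$,

    \[
      \mathbb{E}_{\theta \sim \rho}\, R(\theta)
      \;\le\;
      \mathbb{E}_{\theta \sim \rho}\, r_n(\theta)
      \;+\; \frac{\lambda}{8n}
      \;+\; \frac{\mathrm{KL}(\rho \,\|\, \pi) + \log(1/\delta)}{\lambda},
    \]

where $R$ denotes the risk and $r_n$ the empirical risk; a randomized estimator is then a draw from such a posterior $\rho$.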
Convex recovery of a structured signal from independent random linear measurements
This chapter develops a theoretical analysis of the convex programming method
for recovering a structured signal from independent random linear measurements.
This technique delivers bounds for the sampling complexity that are similar
to recent results for standard Gaussian measurements, but the argument
applies to a much wider class of measurement ensembles. To demonstrate the
power of this approach, the paper presents a short analysis of phase retrieval
by trace-norm minimization. The key technical tool is a framework, due to
Mendelson and coauthors, for bounding a nonnegative empirical process.
Comment: 18 pages, 1 figure. To appear in "Sampling Theory, a Renaissance."
v2: minor corrections. v3: updated citations and increased emphasis on
Mendelson's contribution.
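The phase retrieval application admits a compact illustration (a minimal sketch assuming the cvxpy package; not the paper's code): lift the unknown $x$ to $X = x x^\top$ and minimize the trace over the semidefinite cone subject to the phaseless measurements $b_i = |\langle a_i, x\rangle|^2$.

    # Minimal sketch of phase retrieval by trace-norm minimization
    # (PhaseLift-style; assumes the cvxpy package, illustration only).
    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)
    n, m = 10, 60                           # signal dimension, measurements
    x_true = rng.standard_normal(n)
    A = rng.standard_normal((m, n))         # independent random measurements
    b = (A @ x_true) ** 2                   # phaseless observations

    X = cp.Variable((n, n), PSD=True)       # lifted variable X = x x^T
    cons = [cp.sum(cp.multiply(np.outer(A[i], A[i]), X)) == b[i]
            for i in range(m)]
    cp.Problem(cp.Minimize(cp.trace(X)), cons).solve()

    # Recover x (up to global sign) from the top eigenvector of X.
    w, V = np.linalg.eigh(X.value)
    x_hat = np.sqrt(max(w[-1], 0.0)) * V[:, -1]
    print(min(np.linalg.norm(x_hat - x_true), np.linalg.norm(x_hat + x_true)))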
Mirror Descent and Convex Optimization Problems With Non-Smooth Inequality Constraints
We consider the problem of minimizing a convex function on a simple set
with a convex non-smooth inequality constraint, and describe first-order methods
to solve such problems in different situations: smooth or non-smooth objective
function; convex or strongly convex objective and constraint; deterministic or
randomized information about the objective and constraint. We hope that it is
convenient for a reader to have all the methods for different settings in one
place. The described methods are based on the Mirror Descent algorithm and the
switching subgradient scheme. One of our aims is to propose, for the listed
settings, a Mirror Descent method with adaptive stepsizes and an adaptive
stopping rule, so that neither the stepsize nor the stopping rule requires
knowledge of the Lipschitz constant of the objective or constraint. We also
construct Mirror Descent for problems whose objective function is not Lipschitz
continuous, e.g., a quadratic function. Besides that, we address the problem
of recovering the solution of the dual problem.
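A minimal Euclidean instance of the switching subgradient scheme (our simplified sketch, with the Mirror Descent prox taken as squared Euclidean distance; not the paper's general algorithm) looks as follows: step along a subgradient of the objective $f$ whenever the constraint $g(x) \leq \varepsilon$ holds, otherwise along a subgradient of $g$, with adaptive stepsizes $\varepsilon / \|d\|^2$, and output the average of the productive iterates.

    # Simplified Euclidean sketch of the switching subgradient scheme
    # (illustration only). Productive steps use a subgradient of f when
    # g(x) <= eps; otherwise we step along a subgradient of g. Stepsizes
    # adapt to the subgradient norm, so no Lipschitz constant is needed.
    import numpy as np

    def switching_md(f_grad, g, g_grad, project, x0, eps, n_iter=5000):
        x, productive = x0.copy(), []
        for _ in range(n_iter):
            if g(x) <= eps:                 # productive step on f
                d = f_grad(x)
                productive.append(x.copy())
            else:                           # non-productive step on g
                d = g_grad(x)
            x = project(x - (eps / (d @ d)) * d)
        return np.mean(productive, axis=0)  # average of productive iterates

    # Toy usage: minimize ||x - c||_1 over the unit box subject to the
    # non-smooth constraint ||x||_1 - 1 <= 0.
    c = np.array([2.0, -1.5, 0.5])
    x_hat = switching_md(
        f_grad=lambda x: np.sign(x - c),
        g=lambda x: np.abs(x).sum() - 1.0,
        g_grad=lambda x: np.sign(x),
        project=lambda x: np.clip(x, -1.0, 1.0),
        x0=np.zeros(3), eps=1e-2)
    print(x_hat, np.abs(x_hat).sum())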
Sparsity and Incoherence in Compressive Sampling
We consider the problem of reconstructing a sparse signal $x^0 \in \mathbb{R}^n$ from a
limited number of linear measurements. Given $m$ randomly selected samples of
$U x^0$, where $U$ is an orthonormal matrix, we show that $\ell_1$ minimization
recovers $x^0$ exactly when the number of measurements exceeds
$m \geq \mathrm{Const} \cdot \mu^2(U) \cdot S \cdot \log n$, where $S$ is the number of
nonzero components in $x^0$ and $\mu(U)$ is the largest entry in $U$ properly
normalized: $\mu(U) = \sqrt{n} \cdot \max_{k,j} |U_{k,j}|$. The smaller $\mu(U)$,
the fewer samples needed.
The result holds for ``most'' sparse signals $x^0$ supported on a fixed (but
arbitrary) set $T$. Given $T$, if the sign of $x^0$ for each nonzero entry on
$T$ and the observed values of $U x^0$ are drawn at random, the signal is
recovered with overwhelming probability. Moreover, there is a sense in which
this is nearly optimal since any method succeeding with the same probability
would require just about this many samples.
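A small numeric illustration of the incoherence parameter (ours; constants in the bound omitted): the DCT basis is nearly maximally incoherent, so the proxy $\mu^2(U)\, S \log n$ stays far below $n$, while for the identity basis $\mu(U) = \sqrt{n}$ and the bound becomes vacuous.

    # Sketch (illustration only): the incoherence mu(U) = sqrt(n) * max |U_kj|
    # and the sample-count proxy mu^2 * S * log(n) for two orthonormal bases.
    import numpy as np

    n, S = 256, 8
    j = np.arange(n)

    # Orthonormal DCT-II matrix: nearly maximally incoherent (mu ~ sqrt(2)).
    U = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j[None, :] + 1) * j[:, None] / (2 * n))
    U[0, :] = np.sqrt(1.0 / n)
    mu = np.sqrt(n) * np.abs(U).max()
    print("DCT:      mu =", round(mu, 3), " proxy m ~", round(mu**2 * S * np.log(n)))

    # Identity basis: maximally coherent, so the proxy exceeds n (vacuous).
    mu_id = np.sqrt(n) * np.abs(np.eye(n)).max()
    print("identity: mu =", round(mu_id, 3), " proxy m ~", round(mu_id**2 * S * np.log(n)))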
Some inequalities on generalized entropies
We give several inequalities on generalized entropies involving Tsallis
entropies, using some inequalities obtained by improvements of Young's
inequality. We also give a generalized Han's inequality.
Comment: 15 pages.
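For background (standard definitions, not specific to this paper), the Tsallis entropy $S_q(p) = (1 - \sum_i p_i^q)/(q-1)$ generalizes the Shannon entropy, which it recovers in the limit $q \to 1$; the sketch below checks this numerically.

    # Standard-definition sketch: Tsallis entropy and its q -> 1 Shannon limit.
    import numpy as np

    def tsallis_entropy(p, q):
        p = np.asarray(p, dtype=float)
        p = p[p > 0]
        if np.isclose(q, 1.0):              # q -> 1 limit: Shannon entropy
            return -(p * np.log(p)).sum()
        return (1.0 - (p ** q).sum()) / (q - 1.0)

    p = np.array([0.5, 0.25, 0.125, 0.125])
    for q in (0.5, 0.99, 1.0, 1.01, 2.0):
        print(q, tsallis_entropy(p, q))
    # The values at q = 0.99 and 1.01 bracket the Shannon entropy at q = 1.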